[WIP] BP5 Put performance optimization #1756
I checked it. Works!!
engine.Put(var, ptr);
auto do_defer =
    ba.m_is_bp5 ? adios2::Mode::Sync : adios2::Mode::Deferred;
engine.Put(var, ptr, do_defer);
This incurs an overhead in EndStep() / PerformDataWrite(). Use Async in those cases.
@pnorbert @eisenhauer can you take a look at this? @franzpoeschel and I wonder if this is a performance bug that should rather be fixed in ADIOS2.
IMHO, PerformPuts was always kind of an odd thing. Semantically it's the equivalent of "Remember when I did Put deferred earlier? Forget about that, I really meant Put sync, so copy the data now so I can reuse those buffers". BP4 is always going to copy the data into internal buffers at some point, so it didn't necessarily matter much whether that happened in Put, PerformPuts, or EndStep (which itself calls PerformPuts in BP4). BP5, on the other hand, at least has the ability to not copy the data at all: if you do Put deferred, we just keep that pointer, and EndStep can write the data from application memory directly to disk without any copies. Note that I say "can". For smaller data blocks, BP5 copies at the time of Put whether you say deferred or sync. Some aggregators may also end up copying the data into a contiguous block. There are various other reasons why zero-copy I/O might not happen, but with BP5, doing Put deferred and never calling PerformPuts at least gives you a chance of zero copy. However, if you need to reuse buffers before EndStep, you've got to force the copy sometime. I'm not sure it would make much of a difference whether you do that in Put sync or in PerformPuts.
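To make the copy semantics described above concrete, here is a minimal toy model. It is not the ADIOS2 API: `ToyEngine`, its methods, and the three helper functions are hypothetical stand-ins that only count copies, and the small-block and aggregator copies mentioned above are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: NOT the ADIOS2 API, just a stand-in for the semantics.
enum class Mode { Sync, Deferred };

class ToyEngine {
public:
    // Sync: copy now, so the caller may reuse the buffer immediately.
    // Deferred: only remember the pointer; the caller must keep it valid.
    void Put(const std::vector<double>& data, Mode mode) {
        if (mode == Mode::Sync) copyIn(data);
        else m_deferred.push_back(&data);
    }

    // "I really meant sync": copy every pending deferred block now.
    void PerformPuts() {
        for (const auto* p : m_deferred) copyIn(*p);
        m_deferred.clear();
    }

    // Write internal copies first, then any still-deferred blocks straight
    // from application memory (the zero-copy path in this toy).
    void EndStep() {
        for (const auto& b : m_internal) write(b);
        for (const auto* p : m_deferred) write(*p);
        m_internal.clear();
        m_deferred.clear();
    }

    std::size_t copies() const { return m_copies; }
    const std::vector<double>& written() const { return m_written; }

private:
    void copyIn(const std::vector<double>& d) {
        m_internal.push_back(d);
        ++m_copies;
    }
    void write(const std::vector<double>& d) {
        m_written.insert(m_written.end(), d.begin(), d.end());
    }

    std::vector<std::vector<double>> m_internal;      // engine-owned copies
    std::vector<const std::vector<double>*> m_deferred; // borrowed pointers
    std::vector<double> m_written;                    // stands in for the file
    std::size_t m_copies = 0;
};

// Reuse one scratch buffer for two blocks; safe because Sync copies in Put.
ToyEngine write_with_reuse() {
    ToyEngine e;
    std::vector<double> scratch{1.0, 2.0};
    e.Put(scratch, Mode::Sync);
    scratch = {3.0, 4.0}; // overwrite: the first block was already copied
    e.Put(scratch, Mode::Sync);
    e.EndStep();
    return e;
}

// Deferred without PerformPuts: buffers stay alive until EndStep, no copies.
ToyEngine write_zero_copy() {
    ToyEngine e;
    std::vector<double> a{1.0, 2.0}, b{3.0, 4.0};
    e.Put(a, Mode::Deferred);
    e.Put(b, Mode::Deferred);
    e.EndStep();
    assert(e.copies() == 0);
    return e;
}

// Deferred plus PerformPuts: the copy is forced before the buffer is reused.
ToyEngine write_deferred_then_perform() {
    ToyEngine e;
    std::vector<double> scratch{1.0, 2.0};
    e.Put(scratch, Mode::Deferred);
    e.PerformPuts();
    scratch = {3.0, 4.0};
    e.Put(scratch, Mode::Deferred);
    e.EndStep();
    return e;
}
```

In this toy, all three helpers write the same four values, and only the number of copies differs. Reusing the buffer costs one copy per reused block whether it is forced in Put sync or in PerformPuts, which matches the last point above. Whether the two differ in cost in the real engine is exactly what the benchmark request in this thread is meant to answer.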
According to #1751, BP5 performance takes a hit by calling PerformPuts(). Instead, for BP5, always use Put(Sync), so that we can skip PerformPuts().
@guj Can you check if this really improves performance?